Add the BibTeX entry to the .bib file. You can find entries on Google Scholar, but double-check them since they are not always correct.
Call the citations in the text:
Citation within parentheses (Aust and Barth 2020)
Multiple citations (Aust and Barth 2020; R Core Team 2021)
In-text citations Aust and Barth (2020)
Year only (2021)
A citation shows up in the reference list only if it appears in the text. Don’t manually modify the reference list.
(150 words) – 0.3 POINTS Summarize the report. Write this as the very last thing.
What is the main topic you are addressing?
What are your research questions and hypotheses?
What are your results and the main conclusion?
(about 1000 words) – 0.5 POINTS Place your topic of choice within the existing literature and explain what you are going to address in this report and why. - What is the main topic that is going to be studied in this paper? - Why is it important?
People have always migrated, for many different reasons, such as work or study opportunities, safety, or joining family. Because so many factors play a role in migration, a lot of research has already been performed on this topic. At the same time, much remains unknown, which is why this paper elaborates further on the topic of migration. The focus is specifically on migration within Europe, studied with social network analysis.
As mentioned, people have always migrated; however, the volume of migration has increased over time. In 2020, an estimated 281 million people lived in a country other than their country of birth (McAuliffe and Khadria 2019). This is over three times the estimated number in 1970. This increase brings many other changes, since migration affects a country in different ways. For example, research has shown that skilled migrants have a positive effect on innovation (Fassio, Montobbio, and Venturini 2019). There are also effects on the economy, since migration “boosts the working-age population” and “contributes to technological progress” (OECD 2014).
Since migration has many different effects, it is important to know why people move. This makes it possible to determine which countries will be more affected by migration and to anticipate its effects. In line with this reasoning, Dustmann and Frattini (2011) found differences in the share of European-born immigrants between European nations. An example they give is Ireland, where 70% of the immigrants are from Europe, compared to the UK, where only 21% are from Europe.
Some researchers have already looked at the reasons why people migrate to other countries. The European Parliament distinguishes three categories of push and pull factors for leaving a country: socio-political factors, demographic and economic factors, and environmental factors (European Parliament 2020).
Socio-political factors include persecution, politics (Lam 2002) and culture (Kontuly and Smith 1995). Davenport et al. (2003) indeed find that threats to personal integrity are the most important reason why people migrate. Demographic and economic factors include labor standards, unemployment (DaVanzo 1978) and the country’s overall economic health (Czaika 2015). Environmental factors include fleeing from natural disasters (Drabo and Mbaye 2015; Berlemann and Steinhardt 2017) and, possibly in the future, climate change. Finally, an important factor to keep in mind is technology and technological development. Technological progress creates jobs but also destroys them (Mortensen and Pissarides 1998). According to Hillmann et al. (2020), this loss of jobs in certain segments of the labor market can cause people to move elsewhere. Moreover, when the country of origin lacks technological progress, outmigration becomes more attractive.
To be able to answer the question of why people migrate, two different questions need to be answered. First, it needs to be determined through which countries people migrate and whether this changes over time. With this information, possible travel paths can be determined, which can in turn help explain why and how people migrate. Research has determined that “physical distance and border effects are significant predictors of migration flows among OECD countries” (Tranos, Gheasi, and Nijkamp 2015). Here, physical distance refers to how close countries are to each other. For Europe, this would mean that most migrants travel to or via the geographical midpoint of Europe, which is Belarus or Lithuania. Based on border effects, Austria would be an interesting node for centrality, since it shares borders with eight different countries. Another interesting approach is to look at the migration routes and flows in Europe, such as the Western African Route, Western Mediterranean Route, Central Mediterranean Route, Western Balkan Route, Eastern Mediterranean Route, Eastern Borders Route and the Channel Route (Frontex n.d.). Based on these migration routes, it appears that Austria plays an important role for migration within Europe (Guggenheim 2019; Ambrosini 2016). Based on this information, the following hypothesis can be formulated:
Hypothesis 1: Austria has a high centrality in the social network of Europe’s migration.
Besides looking at the network itself, it is important to also take endogenous and exogenous variables into account. As mentioned before, many different factors have been found to impact migration. In this research all of these factors will be considered, and since technological development is becoming more and more important, we hypothesize that:
As mentioned before, this study makes use of social network analysis, so a network has been created for the countries in Europe. In the network, each country is a node and each edge is a stream of migration from one country to another. The edges are weighted, where the weight represents the number of people migrating from the origin country to the destination country.
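The network construction described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used for the report; the function name `build_migration_network` and the toy flow numbers are hypothetical and are not taken from the World Bank data.

```python
from collections import defaultdict


def build_migration_network(flows):
    """Build a weighted directed graph as {origin: {destination: migrants}}.

    `flows` is an iterable of (origin, destination, migrants) records.
    """
    network = defaultdict(dict)
    for origin, destination, migrants in flows:
        if origin == destination or migrants <= 0:
            continue  # skip self-loops and empty flows
        previous = network[origin].get(destination, 0)
        network[origin][destination] = previous + migrants
        network.setdefault(destination, {})  # keep countries with no outflow
    return dict(network)


# Hypothetical toy records, for illustration only.
toy_flows = [
    ("Poland", "Germany", 1200),
    ("Germany", "Austria", 300),
    ("Austria", "Germany", 150),
    ("Poland", "Poland", 50),  # self-loop, ignored
]
network = build_migration_network(toy_flows)
```

Each key is a node (country), each inner entry is a directed edge, and the stored integer is the edge weight.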
To decide whether Austria has a high centrality, the stress centrality measure was used. This measure counts the number of geodesics (shortest paths between two other countries) that pass through a country. The country with the highest count is the most central country. To check hypothesis 1, the research focused on the stress centrality of Austria in each decade. On that measure, a conditional uniform graph (CUG) test with 2000 simulations was performed. This helps to determine whether the centrality of Austria differs significantly from what one would expect in a network with the same number of vertices and edges as the observed network.
To our knowledge, the approach used in this research has not been used before. By combining network analysis with many different variables, the impact of the variables can be compared and the most important factor can be determined. As mentioned in the introduction, the results from this research can help countries understand how many migrants are coming to their country, and possibly also why they are coming. With this information, appropriate accommodations can be made for them and the impact of these migrants can be determined.
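As a rough illustration of what the stress centrality measure counts, the sketch below computes it for an unweighted directed graph with a breadth-first search from every node. It is written in Python under our own assumptions (the function name `stress_centrality` and the toy graph are hypothetical) and is not the R implementation used for the reported results.

```python
from collections import deque


def stress_centrality(graph):
    """Count, for every node, the shortest paths between other nodes
    that pass through it. `graph` maps each node to a set of successors."""
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)
    stress = {v: 0 for v in nodes}

    for s in nodes:
        dist = {s: 0}      # shortest distance from s
        sigma = {s: 1}     # number of shortest paths from s
        preds = {s: []}    # predecessors on shortest paths
        order = []         # nodes in order of discovery
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, ()):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # tails[v]: number of shortest-path continuations that start at v
        tails = {v: 0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                tails[v] += 1 + tails[w]
            if w != s:
                stress[w] += sigma[w] * tails[w]
    return stress


# Hypothetical toy graph: two parallel routes from A to D, then on to E.
toy = {"A": {"B", "C"}, "B": {"D"}, "C": {"D"}, "D": {"E"}, "E": set()}
scores = stress_centrality(toy)
```

In the toy graph, D lies on the shortest paths A to E (two of them), B to E and C to E, so its stress centrality is 4, while A and E, which only start or end paths, score 0.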
(about 500 words) 1 POINT (+ BONUS) * Which data set are you going to use? Three options:
o Use readily/easily available data (0 bonus points)
o Combine two or more existing datasets (max 0.5 bonus points)
o Scrape or collect your own data (max. 1 bonus point)
o Who collected the data?
o What is the source?
o When was the data produced?
o How was the data collected?
Provide descriptive measures of your data (tables, plots, etc.)
Why is this data useful to study your topic and answer your research questions?
What is the potential bias in the data? How does this affect your results?
The data used for the research project consists of several publicly accessible datasets retrieved from the World Bank (The World Bank Group 2022). These datasets are:
Global Bilateral Migration: Global matrices of bilateral migrant stocks spanning the period 1960-2000, disaggregated by gender and based primarily on the foreign-born concept. Gathered through various censuses. Last updated on 06-28-2011.
World Development Indicators: World development indicators, compiled from officially recognized international sources spanning the period 1960-2000. Last updated on 09-16-2022.
| Column Number | Column Name |
|---|---|
| 1 | Time |
| 2 | Time Code |
| 3 | Country Name |
| 4 | Country Code |
| 5 | Air transport, registered carrier departures worldwide [IS.AIR.DPRT] |
| 6 | Alternative and nuclear energy (% of total energy use) [EG.USE.COMM.CL.ZS] |
| 7 | Cereal production (metric tons) [AG.PRD.CREL.MT] |
| 8 | Electric power consumption (kWh per capita) [EG.USE.ELEC.KH.PC] |
| 9 | Employers, total (% of total employment) (modeled ILO estimate) [SL.EMP.MPYR.ZS] |
| 10 | Fixed telephone subscriptions [IT.MLT.MAIN] |
| 11 | GDP growth (annual %) [NY.GDP.MKTP.KD.ZG] |
| 12 | Individuals using the Internet (% of population) [IT.NET.USER.ZS] |
| 13 | Life expectancy at birth, total (years) [SP.DYN.LE00.IN] |
| 14 | Medium and high-tech exports (% manufactured exports) [TX.MNF.TECH.ZS.UN] |
| 15 | Medium and high-tech manufacturing value added (% manufacturing value added) [NV.MNF.TECH.ZS.UN] |
| 16 | Mobile cellular subscriptions [IT.CEL.SETS] |
| 17 | Mortality rate, adult, female (per 1,000 female adults) [SP.DYN.AMRT.FE] |
| 18 | Mortality rate, adult, male (per 1,000 male adults) [SP.DYN.AMRT.MA] |
| 19 | People using safely managed drinking water services (% of population) [SH.H2O.SMDW.ZS] |
| 20 | Scientific and technical journal articles [IP.JRN.ARTC.SC] |
| 21 | Secure Internet servers [IT.NET.SECR] |
| 22 | Survival to age 65, female (% of cohort) [SP.DYN.TO65.FE.ZS] |
| 23 | Survival to age 65, male (% of cohort) [SP.DYN.TO65.MA.ZS] |
| 24 | Technicians in R&D (per million people) [SP.POP.TECH.RD.P6] |
To answer the research question, data on migration patterns between European nations are necessary. Additional information on factors that might influence this migration is also required. The combined dataset contains information about multiple nations, including all European nations. As the table above shows, various characteristics of the European nations throughout the decades of the late 20th century are also available. This provides the opportunity to analyse the effect of changes within the European nations over the years on migration.
GDP growth in the Netherlands
Telephone subscriptions in Poland
Mortality rate of males in Ireland
The oldest data points stem from 1960, and European nations have changed considerably since then. Some nations in the dataset did not exist in their current form in all of the years covered. Germany, for example, appears only as a single country, whereas for most of the period covered it was actually two countries (West Germany and East Germany). For such nations the data may not be completely accurate, as the numbers in the dataset are a sum of the numbers for both nations. This might skew the data and the results.
(about 500 words) – 1 POINTS * Why are these two methods suitable for your data?
Why are these two methods suitable for your research questions?
Are there other methods to address these questions? If yes, why are the methods you chose better for this case?
The stress centrality measure in combination with the CUG tests helps to answer the question of whether Austria is a central country in Europe’s migrant streams. The stress centrality measure fits this purpose because it considers incoming as well as outgoing edges. In that way, it approximates migrant streams through a country. Furthermore, this measure does not consider streams that start or end in Austria. Therefore, one can see whether Austria is the central player, or in other words, the country that ‘distributes’ migrants through Europe and that incoming migrants use to get to other countries in Europe. The CUG tests show how high the average centrality would ‘normally’ be in a network with the same number of edges and vertices as the observed network. Therefore, one can assess whether Austria is a more central country than expected.
Instead of the stress centrality measure, one could also have chosen one of numerous other centrality measures. For instance, the eccentricity measure determines the maximum number of steps needed to reach any other country in the network. The country with the minimum eccentricity would be the most central country. However, this does not fit the purpose of this research, since the measure only considers outgoing edges. The Shapley centrality would have been a good measure, since it can also consider the weights, that is, the intensity of a migrant stream (Michalak et al. 2013). However, this method is currently not implemented in R.
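To make the contrast with eccentricity concrete, the following Python sketch computes the eccentricity of a single node by following outgoing edges only. The function name `eccentricity` and the toy graph are hypothetical, and treating unreachable countries as infinitely far away is an assumption of this sketch.

```python
import math
from collections import deque


def eccentricity(graph, source):
    """Maximum number of steps needed to reach any other node from `source`,
    following outgoing edges only. Returns infinity if some node is unreachable."""
    nodes = set(graph) | {source}
    for successors in graph.values():
        nodes.update(successors)

    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)

    if len(dist) < len(nodes):
        return math.inf  # assumption: unreachable nodes count as infinitely far
    return max(dist.values())


# Hypothetical toy graph: C reaches both other nodes in one step.
toy = {"A": {"B"}, "B": {"C"}, "C": {"A", "B"}}
ecc = {country: eccentricity(toy, country) for country in toy}
```

Here C has the lowest eccentricity and would count as the most central node, even though the measure says nothing about how many paths between other nodes run through it.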
(about 2000 words)
This research has taken two different approaches to check the hypotheses. Therefore, two models were created. The models and their results are presented and discussed in the sections below.
(about 1000 words) – 2.5 POINTS
Present your results appropriately (plots, tables…) and discuss your findings in plain English
Discuss the meaning of your findings in relation to your hypothesis. (half of the points evaluated in this other part)
The centrality of Austria was estimated by the stress centrality measure. The centrality was calculated for every country in the network, in each decade that is represented in the dataset. The results of those centrality calculations for every country can be seen in the table below.
| Country | Centrality_1960 | Centrality_1970 | Centrality_1980 | Centrality_1990 | Centrality_2000 |
|---|---|---|---|---|---|
| Albania | 44 | 30 | 24 | 57 | 13 |
| Austria | 162 | 147 | 79 | 84 | 97 |
| Belarus | 58 | 1 | 44 | 29 | 2 |
| Belgium | 115 | 85 | 99 | 76 | 75 |
| Bosnia and Herzegovina | 70 | 42 | 53 | 75 | 83 |
| Bulgaria | 54 | 42 | 36 | 36 | 57 |
| Croatia | 59 | 29 | 35 | 94 | 53 |
| Cyprus | 29 | 23 | 10 | 8 | 6 |
| Czech Republic | 69 | 90 | 122 | 106 | 87 |
| Denmark | 107 | 74 | 74 | 67 | 56 |
| Estonia | 4 | 10 | 4 | 32 | 79 |
| Finland | 35 | 74 | 69 | 66 | 83 |
| France | 162 | 147 | 81 | 70 | 97 |
| Georgia | 4 | 5 | 4 | 7 | 2 |
| Germany | 162 | 147 | 109 | 95 | 97 |
| Greece | 121 | 113 | 79 | 95 | 87 |
| Hungary | 142 | 94 | 79 | 76 | 67 |
| Iceland | 0 | 1 | 2 | 6 | 24 |
| Ireland | 53 | 41 | 26 | 24 | 44 |
| Italy | 162 | 147 | 109 | 76 | 97 |
| Latvia | 6 | 7 | 5 | 32 | 157 |
| Liechtenstein | 0 | 1 | 3 | 2 | 18 |
| Lithuania | 12 | 10 | 5 | 32 | 137 |
| Luxembourg | 33 | 55 | 53 | 69 | 61 |
| Macedonia, FYR | 68 | 44 | 50 | 92 | 70 |
| Monaco | 1 | 1 | 6 | 10 | 2 |
| Netherlands | 147 | 147 | 109 | 32 | 64 |
| Norway | 39 | 80 | 71 | 59 | 44 |
| Poland | 575 | 437 | 474 | 439 | 129 |
| Portugal | 21 | 21 | 48 | 33 | 6 |
| Romania | 65 | 49 | 44 | 30 | 46 |
| Russian Federation | 61 | 25 | 65 | 36 | 92 |
| Slovak Republic | 70 | 84 | 105 | 90 | 56 |
| Slovenia | 15 | 13 | 16 | 24 | 17 |
| Spain | 94 | 88 | 78 | 41 | 45 |
| Sweden | 127 | 118 | 109 | 56 | 97 |
| Switzerland | 147 | 147 | 109 | 95 | 97 |
| Ukraine | 57 | 32 | 145 | 154 | 92 |
| United Kingdom | 81 | 147 | 109 | 46 | 87 |
The results for Austria, compared to other countries, show that Austria was among the most central countries in Europe in 1960 and 1970. In later decades, Austria seems to have an average stress centrality in Europe. The results from the table are plotted below to get a better view of the (most) central countries for migrant streams in Europe. The vertex size corresponds with the stress centrality of a country (vertex). The edge width corresponds with the edge weight, i.e., the number of migrants in that decade.
The plots indeed show that as of 1980, several countries within Europe are more central than Austria regarding migrant streams. Poland seems to have the highest stress centrality in all decades except for 2000. In that decade, the differences are less clear. In every decade, Poland has clear links with Germany (and France) and with Ukraine (and Belarus). This shows that Poland acts as a central player in the migrant stream from east to west or west to east. For Austria, those links are not that clear from the plot. The only clear link is the one with Germany, but that does not show centrality of Austria. The migration routes that were shown by Guggenheim (2019) cannot be seen in these observations. However, looking at the color distributions of all plots, Austria always has a higher centrality than many other countries in Europe. All in all, the plots indicate that Austria is not the country with the highest stress centrality in Europe, but that it does not belong to the lower end either.
However, the plots do not give a statistically decisive judgement about the significance of Austria’s centrality. To get a better view of that, the CUG tests are used. For each decade, the stress centrality of Austria is compared with the average expected centrality for a network with the same number of vertices and edges. Each CUG test simulated 2000 such networks. The average stress centrality was stored for every simulation. The empirical results can be seen below for every decade, where the simulations are represented by the density plot and the stress centrality of Austria is represented by the dashed line.
In every decade, the stress centrality of Austria seems to be exceptionally low compared to the values from simulated networks with the same number of vertices and directed edges. To quantify how exceptional this difference is, the proportion of simulated stress centralities that are higher than the stress centrality of Austria was calculated for every decade. The results can be seen below.
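The logic of a CUG test conditioned on the number of vertices and edges can be sketched as below. This is an illustrative Python version, not the R code behind the reported results; the names `random_digraph` and `cug_test`, the toy graph and the stand-in statistic (the out-degree of one node) are assumptions made for this sketch. In the report, the stored statistic was the average stress centrality.

```python
import random
from itertools import permutations


def random_digraph(nodes, n_edges, rng):
    """Draw a directed graph without self-loops, uniformly at random
    among all graphs on `nodes` with exactly `n_edges` edges."""
    possible_edges = list(permutations(nodes, 2))
    graph = {v: set() for v in nodes}
    for origin, destination in rng.sample(possible_edges, n_edges):
        graph[origin].add(destination)
    return graph


def cug_test(observed_value, nodes, n_edges, statistic, n_sim=2000, seed=0):
    """Compare an observed value with the same statistic on simulated graphs.

    Returns the simulated values and the proportions of simulations that are
    greater than or equal to, and less than or equal to, the observed value."""
    rng = random.Random(seed)
    simulated = [
        statistic(random_digraph(nodes, n_edges, rng)) for _ in range(n_sim)
    ]
    p_greater_equal = sum(x >= observed_value for x in simulated) / n_sim
    p_less_equal = sum(x <= observed_value for x in simulated) / n_sim
    return simulated, p_greater_equal, p_less_equal


# Hypothetical observed network with 5 countries and 6 directed edges.
observed = {
    "Austria": {"Germany"},
    "Germany": {"Austria", "France"},
    "Poland": {"Germany", "Ukraine"},
    "Ukraine": {"Poland"},
    "France": set(),
}
nodes = list(observed)
n_edges = sum(len(successors) for successors in observed.values())


def austria_out_degree(graph):
    """Stand-in statistic for the sketch; any graph-level function works."""
    return len(graph["Austria"])


simulated, p_ge, p_le = cug_test(
    austria_out_degree(observed), nodes, n_edges, austria_out_degree
)
```

A proportion `p_ge` close to 1 means that almost all simulated networks score at least as high as the observed one, which is the situation reported for Austria in the results table.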
| Year | Proportion of centralities higher than Austria |
|---|---|
| 1960 | 1 |
| 1970 | 1 |
| 1980 | 1 |
| 1990 | 1 |
| 2000 | 1 |
So, in every decade that has been analyzed, 100% of the simulated networks with the same number of vertices and directed edges have a higher stress centrality than Austria. It can therefore be said that the stress centrality of Austria is statistically significantly lower than what would be expected for a network with this number of vertices and directed edges. Hypothesis 1 predicted that Austria has a high centrality in the social network of Europe’s migration. The results in the table above do not support this hypothesis. Therefore, this hypothesis is rejected. Although Austria belongs to the higher end of the centralities of European countries according to the stress centrality plots, its value would be expected to be higher in a network that is similar to the European migrant network.
Note that the weights of the edges are not included in the analyses, since that was not possible in the current implementation of the measure in R. Including the edge weights in the centrality analysis might give different results for the centrality of Austria. The weights might give a better indication of how many migrants ‘pass’ through a country to get to their destination within Europe.
(about 1000) – 2.5 POINTS
Present your results appropriately (plots, tables…) and discuss your findings in plain English
Discuss the meaning of your findings in relation to your hypothesis. (half of the points evaluated in this other part)
Option 1:
| | Model 1 |
|---|---|
| (Intercept) | 5.03 *** |
| | (0.22) |
| groupTrt | -0.37 |
| | (0.31) |
| R^2 | 0.07 |
| Adj. R^2 | 0.02 |
| Num. obs. | 20 |
Option 2
| | Model 1 | Model 2 |
|---|---|---|
| (Intercept) | 5.03 *** | |
| | (0.22) | |
| groupTrt | -0.37 | 4.66 *** |
| | (0.31) | (0.22) |
| groupCtl | | 5.03 *** |
| | | (0.22) |
| R^2 | 0.07 | 0.98 |
| Adj. R^2 | 0.02 | 0.98 |
| Num. obs. | 20 | 20 |
Option 3
## Model: bars denote standard errors (95%).
Option 4
## Models: bars denote standard errors (95%).
(about 350 words) – 0.7 POINTS What were your topic and research questions again? (1 sentence)
What did you learn from the two analyses you ran? *** most important point to address 0.5 POINTS here
Who benefits from your findings?
What remains an open problem?
Can you give suggestions for future work in this area?